2025-04-18 08:01:41. AIbase
ByteDance Doubao Open-Sources Seed Agent Model UI-TARS-1.5
The ByteDance Doubao large model team has open-sourced UI-TARS-1.5, a multimodal agent built on a vision-language model that can efficiently carry out a variety of tasks in virtual worlds. The model achieved state-of-the-art (SOTA) results on seven typical GUI (graphical user interface) benchmarks and, for the first time, demonstrated long-term reasoning in games and interactive capabilities in open spaces. The release marks a significant advance in multimodal agent technology for GUIs.